feat(cli): print session summary at end of exec runs by Sayt-0 · Pull Request #3278 · docker/docker-agent

Sayt-0 · 2026-06-26T17:04:26Z

Summary

Adds an end-of-run session summary for non-interactive (--exec) runs. Until now --exec printed no usage recap, while the TUI only exposed the on-demand /cost dialog. The recap covers interaction stats (tool calls, success rate), per-model token usage, total cost, and cache savings, matching the request in #1345.

The data is derived from the existing session through a new Session.Stats() helper, so no new tracking pipeline or protocol events are introduced.

Sample output

─── Session summary ───
Session:       7532d5a9-92e9-4197-ae26-34855d7167f7
Duration:      1s
Requests:      2
Tool calls:    1 (1 ok, 0 failed)
Success rate:  100.0%

Model usage
MODEL           REQS   INPUT   CACHE READ   OUTPUT
openai/gpt-4o      2   1,600            0       20

Tokens:  1,600 in / 20 out
Cost:    $0.0042

Mapping to the request (#1345)

Requested field	Status
Session ID	done
Tool calls with success/failure split	done
Success rate	done
Per-model usage (requests, input, cache reads, output)	done
Token totals and cache savings	done
Wall-clock duration	done
Agent-active / API-time / tool-time split	deferred (needs runtime timing instrumentation)
User Agreement	dropped (ambiguous metric)

Design notes

The recap is written to stderr, so stdout stays machine-consumable. The existing exec stdout contract (asserted by the e2e TestExec_* tests) is unchanged.
It is skipped in JSON output mode, where structured events are emitted instead.
It is a no-op when the session recorded no measurable activity.
No observability stack is added. General observability remains the role of OTEL, consistent with the feedback on the earlier closed PR runtime: add session-level observability plumbing #1436.

Validation

Check	Result
`task build`	pass
`golangci-lint` on `pkg/session`, `pkg/cli`	0 issues
Unit tests (`pkg/session`, `pkg/cli`)	pass
e2e (`TestExec_*`)	pass (stdout unchanged)
Manual real run (replay cassette, plain and tool-call)	recap on stderr, stdout clean

Out of scope / possible follow-ups

API-time vs tool-time split (requires lightweight runtime timing).
A flag to suppress the recap (currently it can be silenced with 2>/dev/null).
Surfacing the same stats in JSON output mode.

Non-interactive runs previously emitted no usage recap; the TUI only had the on-demand /cost dialog. This adds an end-of-run summary covering the interaction (tool calls, success rate), per-model token usage, and total cost. The recap is derived from the existing session via a new Session.Stats() helper (tool-call success/failure counts, per-model token and cost breakdown, cache savings, duration). It is written to stderr so stdout stays machine-consumable, and is skipped in JSON output mode. Refs #1345

dgageot · 2026-06-26T17:07:18Z

@Sayt-0 add a --summary to enable this feature

docker-agent

Assessment: 🟡 NEEDS ATTENTION

One medium-severity issue found in the new Session.Stats() implementation. The overall PR structure is sound — stderr output, JSON mode guard, no-op for empty sessions, and correct token math are all well-implemented.

docker-agent · 2026-06-26T17:09:02Z

+			switch {
+			case item.IsMessage():
+				m := item.Message.Message
+				st.Cost += m.Cost


[MEDIUM] Potential cost double-counting in Stats() remote-mode fallback

st.Cost += m.Cost is accumulated unconditionally for every message during the walk (line 113), including assistant messages where m.Usage == nil. However, st.Requests is only incremented when m.Usage != nil. In remote mode, usage is stored in MessageUsageHistory rather than on the message object — so it is architecturally possible for an assistant message to carry m.Cost > 0 while m.Usage == nil.

When this happens:

The walk adds the message's cost to st.Cost (via line 113).

After the walk, st.Requests == 0 → the remote-mode fallback fires.

The fallback also adds r.Cost for the same call from MessageUsageHistory.

The cost is counted twice. The fix is either to skip the message-level cost accumulation when m.Usage == nil (since that cost will be covered by the fallback path), or to zero out st.Cost before entering the fallback loop:

// Option A — only accumulate m.Cost when usage is present: case chat.MessageRoleAssistant: if m.Usage != nil { st.Cost += m.Cost // move cost inside the usage guard st.Requests++ // ... } // Option B — reset before fallback: if st.Requests == 0 && len(s.MessageUsageHistory) > 0 { st.Cost = 0 // discard costs collected without usage data for _, r := range s.MessageUsageHistory { ... } }

Sayt-0 requested a review from a team as a code owner June 26, 2026 17:04

docker-agent reviewed Jun 26, 2026

View reviewed changes

Sayt-0 closed this Jun 26, 2026

Sayt-0 deleted the feat/exec-session-summary branch June 26, 2026 17:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(cli): print session summary at end of exec runs#3278

feat(cli): print session summary at end of exec runs#3278
Sayt-0 wants to merge 1 commit into
mainfrom
feat/exec-session-summary

Sayt-0 commented Jun 26, 2026

Uh oh!

dgageot commented Jun 26, 2026

Uh oh!

docker-agent left a comment

Uh oh!

docker-agent Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

Sayt-0 commented Jun 26, 2026

Summary

Sample output

Mapping to the request (#1345)

Design notes

Validation

Out of scope / possible follow-ups

Uh oh!

dgageot commented Jun 26, 2026

Uh oh!

docker-agent left a comment

Choose a reason for hiding this comment

Assessment: 🟡 NEEDS ATTENTION

Uh oh!

docker-agent Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants